%0 Conference Proceedings
%4 sid.inpe.br/sibgrapi/2018/08.24.16.12
%2 sid.inpe.br/sibgrapi/2018/08.24.16.12.54
%@doi 10.1109/SIBGRAPI.2018.00030
%T Multicenter Imaging Studies: Automated Approach to Evaluating Data Variability and the Role of Outliers
%D 2018
%A Bento, Mariana,
%A Souza, Roberto,
%A Frayne, Richard,
%@affiliation University of Calgary
%@affiliation University of Calgary
%@affiliation University of Calgary
%E Ross, Arun,
%E Gastal, Eduardo S. L.,
%E Jorge, Joaquim A.,
%E Queiroz, Ricardo L. de,
%E Minetto, Rodrigo,
%E Sarkar, Sudeep,
%E Papa, João Paulo,
%E Oliveira, Manuel M.,
%E Arbeláez, Pablo,
%E Mery, Domingo,
%E Oliveira, Maria Cristina Ferreira de,
%E Spina, Thiago Vallin,
%E Mendes, Caroline Mazetto,
%E Costa, Henrique Sérgio Gutierrez,
%E Mejail, Marta Estela,
%E Geus, Klaus de,
%E Scheer, Sergio,
%B Conference on Graphics, Patterns and Images, 31 (SIBGRAPI)
%C Foz do Iguaçu, PR, Brazil
%8 29 Oct.-1 Nov. 2018
%I IEEE Computer Society
%J Los Alamitos
%S Proceedings
%K multicenter MR data, outlier detection, data variability.
%X Magnetic resonance (MR) imaging, as well as other imaging modalities, has been used in a large number of clinical and research studies for the analysis and quantification of important structures and the detection of abnormalities. In this context, machine learning is playing an increasingly important role in the development of automated tools for aiding in image quantification, patient diagnosis and follow-up. Normally, these techniques require large, heterogeneous datasets to provide accurate and generalizable results. Large, multicenter studies, for example, can provide such data. Images acquired at different centers, however, can present varying characteristics due to differences in acquisition parameters, site procedures and scanner configurations. While variability in the dataset is required to develop robust, generalizable studies (i.e., independent of the acquisition parameters or center), there is also a need, as in all studies, to ensure overall data quality by prospectively identifying and removing poor-quality samples (i.e., outliers) that should not be included. We wish to keep image samples that are representative of the underlying population (so-called inliers), while removing those that are not. We propose a framework to analyze data variability and identify samples that should be removed in order to obtain more representative, reliable and robust datasets. Our example case study is based on a public dataset containing T1-weighted volumetric head images acquired at six different centers, using scanners from three different vendors and at two commonly used magnetic field strengths. We propose an algorithm for assessing data robustness and finding the optimal data for study inclusion (i.e., the data size that presents the lowest variability while maintaining generalizability, i.e., using samples from all sites).
%@language en
%3 57_manuscript.pdf